Supplementary Material for “RGB-Infrared Cross-Modality Person Re-Identification”

Authors

  • Ancong Wu
  • Wei-Shi Zheng
  • Hong-Xing Yu
  • Shaogang Gong
Abstract

This supplementary material accompanies the paper “RGB-Infrared Cross-Modality Person Re-Identification”. It provides more details on Section 4, as well as extra evaluations of our proposed deep zero-padding method.

1. Details of Counting Domain-Specific Nodes

In the third paragraph of Section 4.2 in the main manuscript, we quantify the number of domain-specific nodes in the trained network in our experiments. As defined in Equation (3) in Section 3 of the main manuscript, the categorization of node types is rather strict. In the $l$-th layer, let $\eta_i^{(l)}$ denote the $i$-th node and $f_{out}(x, i, l)$ denote the output of $\eta_i^{(l)}$ given the network input $x$. Let $x_{d1}^{(0)}$ and $x_{d2}^{(0)}$ be inputs of the whole network from domain1 and domain2, respectively. The type of node $\eta_i^{(l)}$ is defined by

$$
\mathrm{type}(\eta_i^{(l)}) =
\begin{cases}
\text{domain1-specific}, & f_{out}(x_{d2}^{(0)}, i, l) \equiv 0 \\
\text{domain2-specific}, & f_{out}(x_{d1}^{(0)}, i, l) \equiv 0 \\
\text{shared}, & \text{otherwise.}
\end{cases}
\tag{1}
$$

Because the identity sign ($\equiv$) is used here, this condition is too strict in practice. We therefore relax it when counting domain-specific nodes by introducing a threshold $T$. The relaxed definition of node type is as follows: for all $x_{d1}^{(0)}$ and $x_{d2}^{(0)}$ in our experiments,

$$
\mathrm{type}(\eta_i^{(l)}) =
\begin{cases}
\text{domain1-specific}, & f_{out}(x_{d2}^{(0)}, i, l) < T \ \text{and} \ f_{out}(x_{d1}^{(0)}, i, l) > T \\
\text{domain2-specific}, & f_{out}(x_{d1}^{(0)}, i, l) < T \ \text{and} \ f_{out}(x_{d2}^{(0)}, i, l) > T \\
\text{shared}, & \text{otherwise.}
\end{cases}
\tag{2}
$$

Because the scales of responses on feature maps differ from layer to layer, we set $T = \alpha \, \mathrm{std}(x_i)$, where $\alpha$ is a proportion coefficient, $x_i$ is the output value of the $i$-th node in the $l$-th layer and $\mathrm{std}(\cdot)$ is the standard deviation function. For an image channel in our experiments, we take the average of all values in the feature map as the output of the node. We set $\alpha = 0.01$ and $\alpha = 0.05$ for the strict and loose categorizations, respectively. The relation between the proportion of domain-specific nodes and layer depth is shown in Figure S1.
Both the total proportions and the respective proportions of the two domains are shown. With the strict threshold, domain-specific nodes mainly exist in the first three layers. With the loose threshold, domain-specific nodes mainly exist in the first five layers. In both cases, the network learns more domain-specific nodes when deep zero-padding is used. When the threshold is loosened, the proportion of domain-specific nodes increases with deep zero-padding, but remains nearly unchanged when the inputs are not zero-padded.

2. Evaluation on Using Different Networks

Our deep model is based on ResNet [1], as illustrated in Section 5 in the main manuscript. Deep zero-padding has been shown to be effective on ResNet-6 in our experiments. To verify whether deep zero-padding also works with other one-stream networks, we additionally evaluated our method on the popular architectures AlexNet [2] and VGG-16 [3]. The results are reported in Table S1. Deep zero-padding improves the performance in most cases for all evaluated network architectures. The improvement is especially evident for ResNet-6.
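
The relaxed categorization in Equation (2) of Section 1 can be illustrated with a minimal sketch. This is not the authors' code: the function names `classify_nodes` and `domain_specific_proportion`, the nested-list input layout, and the choice to compute the standard deviation as a population standard deviation over the node's outputs on all inputs from both domains pooled together are assumptions made here for illustration. The toy numbers are invented and are not results from the paper.

```python
from statistics import pstdev


def classify_nodes(out_d1, out_d2, alpha):
    """Label each node of one layer following the relaxed rule of Equation (2).

    out_d1[n][i] / out_d2[n][i] is the output of node i for the n-th input
    from domain1 / domain2 (for a channel, the mean of its feature map).
    """
    num_nodes = len(out_d1[0])
    labels = []
    for i in range(num_nodes):
        r1 = [row[i] for row in out_d1]  # responses to domain1 inputs
        r2 = [row[i] for row in out_d2]  # responses to domain2 inputs
        # Assumption: T = alpha * std over this node's outputs on all inputs.
        threshold = alpha * pstdev(r1 + r2)
        if all(v < threshold for v in r2) and all(v > threshold for v in r1):
            labels.append("domain1-specific")
        elif all(v < threshold for v in r1) and all(v > threshold for v in r2):
            labels.append("domain2-specific")
        else:
            labels.append("shared")
    return labels


def domain_specific_proportion(labels):
    """Fraction of nodes in the layer that are not shared."""
    return sum(label != "shared" for label in labels) / len(labels)


# Toy layer with three nodes and two inputs per domain (hypothetical values).
out_d1 = [[1.0, 0.0, 1.0], [2.0, 0.0, 3.0]]
out_d2 = [[0.0, 1.5, 2.0], [0.0, 2.5, 1.0]]
labels = classify_nodes(out_d1, out_d2, alpha=0.05)
print(labels)  # ['domain1-specific', 'domain2-specific', 'shared']
print(domain_specific_proportion(labels))
```

In this sketch a node whose output is constant over all inputs gets a threshold of zero, so neither strict inequality can hold and the node is counted as shared. Running the same function with `alpha=0.01` and `alpha=0.05` for each layer would give the strict and loose proportions, respectively, under the stated assumptions.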


Similar Resources

Person Depth ReID: Robust Person Re-identification with Commodity Depth Sensors

This work targets person re-identification (ReID) from depth sensors such as Kinect. Since depth is invariant to illumination and less sensitive than color to day-by-day appearance changes, a natural question is whether depth is an effective modality for Person ReID, especially in scenarios where individuals wear different colored clothes or over a period of several months. We explore the use o...

Volume-based Human Re-identification with RGB-D Cameras

This paper presents an RGB-D based human re-identification approach using novel biometrics features from the body’s volume. Existing work based on RGB images or skeleton features have some limitations for real-world robotic applications, most notably in dealing with occlusions and orientation of the user. Here, we propose novel features that allow performing re-identification when the person is ...

Algorithms for People Re-identification from RGB-D Videos Exploiting Skeletal Information

In this thesis, a novel methodology to face the people re-identification problem is proposed. Re-identification is a complex research topic in Computer Vision representing a fundamental issue, especially for intelligent video surveillance applications. Its goal is to determine the occurrences of the same person in different video sequences or images, usually by choosing from a high number of ca...

One-Shot Person Re-identification with a Consumer Depth Camera

In this chapter, we propose a comparison between two techniques for one-shot person re-identification from soft biometric cues. One is based upon a descriptor composed of features provided by a skeleton estimation algorithm; the other compares body shapes in terms of whole point clouds. This second approach relies on a novel technique we propose to warp the subject’s point cloud to a standard po...

Learning Efficient Image Representation for Person Re-Identification

Color names based image representation is successfully used in person re-identification, due to the advantages of being compact, intuitively understandable as well as being robust to photometric variance. However, there exists the diversity between underlying distribution of color names’ RGB values and that of image pixels’ RGB values, which may lead to inaccuracy when directly comparing them i...

Publication date: 2017